Exploration server for computer science research in Lorraine

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

On noise masking for automatic missing data speech recognition: A survey and discussion

Internal identifier: 004D47 (Main/Exploration); previous: 004D46; next: 004D48

Authors: Christophe Cerisara [France]; Sébastien Demange [France]; Jean-Paul Haton [France]

RBID : Francis:09-0009259

Abstract

Automatic speech recognition (ASR) has reached very high levels of performance in controlled situations. However, the performance degrades significantly when environmental noise occurs during the recognition process. Nowadays, the major challenge is to reach a good robustness to adverse conditions, so that automatic speech recognizers can be used in real situations. Missing data theory is a very attractive and promising approach. Unlike other denoising methods, missing data recognition does not match the whole data with the acoustic models, but instead considers part of the signal as missing, i.e. corrupted by noise. While speech recognition with missing data can be handled efficiently by methods such as data imputation or marginalization, accurately identifying missing parts (also called masks) remains a very challenging task. This paper reviews the main approaches that have been proposed to address this problem. The objective of this study is to identify the mask estimation methods that have been proposed so far, and to open this domain up to other related research, which could be adapted to overcome this difficult challenge. In order to restrict the range of methods, only the techniques using a single microphone are considered.

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">On noise masking for automatic missing data speech recognition : A survey and discussion</title>
<author>
<name sortKey="Cerisara, Christophe" sort="Cerisara, Christophe" uniqKey="Cerisara C" first="Christophe" last="Cerisara">Christophe Cerisara</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA, UMR 7503</s1>
<s2>Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region">Grand Est</region>
<region type="old region">Lorraine (région)</region>
<settlement type="city">Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Demange, Sebastien" sort="Demange, Sebastien" uniqKey="Demange S" first="Sébastien" last="Demange">Sébastien Demange</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA, UMR 7503</s1>
<s2>Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region">Grand Est</region>
<region type="old region">Lorraine (région)</region>
<settlement type="city">Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Haton, Jean Paul" sort="Haton, Jean Paul" uniqKey="Haton J" first="Jean-Paul" last="Haton">Jean-Paul Haton</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA, UMR 7503</s1>
<s2>Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region">Grand Est</region>
<region type="old region">Lorraine (région)</region>
<settlement type="city">Nancy</settlement>
</placeName>
<placeName>
<settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="laboratoire" n="5">Laboratoire lorrain de recherche en informatique et ses applications</orgName>
<orgName type="university">Université de Lorraine</orgName>
<orgName type="institution">Centre national de la recherche scientifique</orgName>
<orgName type="institution">Institut national de recherche en informatique et en automatique</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">09-0009259</idno>
<date when="2007">2007</date>
<idno type="stanalyst">FRANCIS 09-0009259 INIST</idno>
<idno type="RBID">Francis:09-0009259</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000298</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000733</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000297</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">000297</idno>
<idno type="wicri:doubleKey">0885-2308:2007:Cerisara C:on:noise:masking</idno>
<idno type="wicri:Area/Main/Merge">004E83</idno>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:inria-00160554</idno>
<idno type="url">https://hal.inria.fr/inria-00160554</idno>
<idno type="wicri:Area/Hal/Corpus">003726</idno>
<idno type="wicri:Area/Hal/Curation">003726</idno>
<idno type="wicri:Area/Hal/Checkpoint">003D20</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">003D20</idno>
<idno type="wicri:doubleKey">0885-2308:2007:Cerisara C:on:noise:masking</idno>
<idno type="wicri:Area/Main/Merge">005011</idno>
<idno type="wicri:Area/Main/Curation">004D47</idno>
<idno type="wicri:Area/Main/Exploration">004D47</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">On noise masking for automatic missing data speech recognition : A survey and discussion</title>
<author>
<name sortKey="Cerisara, Christophe" sort="Cerisara, Christophe" uniqKey="Cerisara C" first="Christophe" last="Cerisara">Christophe Cerisara</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA, UMR 7503</s1>
<s2>Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region">Grand Est</region>
<region type="old region">Lorraine (région)</region>
<settlement type="city">Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Demange, Sebastien" sort="Demange, Sebastien" uniqKey="Demange S" first="Sébastien" last="Demange">Sébastien Demange</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA, UMR 7503</s1>
<s2>Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region">Grand Est</region>
<region type="old region">Lorraine (région)</region>
<settlement type="city">Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Haton, Jean Paul" sort="Haton, Jean Paul" uniqKey="Haton J" first="Jean-Paul" last="Haton">Jean-Paul Haton</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA, UMR 7503</s1>
<s2>Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region">Grand Est</region>
<region type="old region">Lorraine (région)</region>
<settlement type="city">Nancy</settlement>
</placeName>
<placeName>
<settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="laboratoire" n="5">Laboratoire lorrain de recherche en informatique et ses applications</orgName>
<orgName type="university">Université de Lorraine</orgName>
<orgName type="institution">Centre national de la recherche scientifique</orgName>
<orgName type="institution">Institut national de recherche en informatique et en automatique</orgName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Computer speech &amp; language : (Print)</title>
<title level="j" type="abbreviated">Comput. speech lang. : (Print)</title>
<idno type="ISSN">0885-2308</idno>
<imprint>
<date when="2007">2007</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Computer speech &amp; language : (Print)</title>
<title level="j" type="abbreviated">Comput. speech lang. : (Print)</title>
<idno type="ISSN">0885-2308</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Computational linguistics</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Linguistique informatique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Automatic speech recognition (ASR) has reached very high levels of performance in controlled situations. However, the performance degrades significantly when environmental noise occurs during the recognition process. Nowadays, the major challenge is to reach a good robustness to adverse conditions, so that automatic speech recognizers can be used in real situations. Missing data theory is a very attractive and promising approach. Unlike other denoising methods, missing data recognition does not match the whole data with the acoustic models, but instead considers part of the signal as missing, i.e. corrupted by noise. While speech recognition with missing data can be handled efficiently by methods such as data imputation or marginalization, accurately identifying missing parts (also called masks) remains a very challenging task. This paper reviews the main approaches that have been proposed to address this problem. The objective of this study is to identify the mask estimation methods that have been proposed so far, and to open this domain up to other related research, which could be adapted to overcome this difficult challenge. In order to restrict the range of methods, only the techniques using a single microphone are considered.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Grand Est</li>
<li>Lorraine (région)</li>
</region>
<settlement>
<li>Nancy</li>
</settlement>
<orgName>
<li>Centre national de la recherche scientifique</li>
<li>Institut national de recherche en informatique et en automatique</li>
<li>Laboratoire lorrain de recherche en informatique et ses applications</li>
<li>Université de Lorraine</li>
</orgName>
</list>
<tree>
<country name="France">
<region name="Grand Est">
<name sortKey="Cerisara, Christophe" sort="Cerisara, Christophe" uniqKey="Cerisara C" first="Christophe" last="Cerisara">Christophe Cerisara</name>
</region>
<name sortKey="Demange, Sebastien" sort="Demange, Sebastien" uniqKey="Demange S" first="Sébastien" last="Demange">Sébastien Demange</name>
<name sortKey="Haton, Jean Paul" sort="Haton, Jean Paul" uniqKey="Haton J" first="Jean-Paul" last="Haton">Jean-Paul Haton</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 004D47 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 004D47 | SxmlIndent | more
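
Once a record has been extracted, its main fields can also be read with a general-purpose XML parser. The following is a minimal Python sketch, not part of Dilib, run on a shortened copy of the record shown above. The record uses the `wicri:` and `inist:` prefixes without declaring them, so the sketch wraps it in an element that binds them to placeholder URIs; those URIs, the `wrapper` element and the `summarize` function are assumptions made for illustration. Any bare `&` in a full record would also need to be written as `&amp;` before parsing.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record above; only the fields read below are kept.
RECORD = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">On noise masking for automatic missing data speech recognition : A survey and discussion</title>
<author>
<name sortKey="Cerisara, Christophe" first="Christophe" last="Cerisara">Christophe Cerisara</name>
<affiliation wicri:level="3">
<country>France</country>
</affiliation>
</author>
<author>
<name sortKey="Demange, Sebastien" first="Sébastien" last="Demange">Sébastien Demange</name>
<affiliation wicri:level="3">
<country>France</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="RBID">Francis:09-0009259</idno>
<date when="2007">2007</date>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Placeholder namespace bindings (hypothetical URIs, not the real ones):
# they only exist so that a strict parser accepts the wicri:/inist: prefixes.
WRAPPER = (
    '<wrapper xmlns:wicri="urn:example:wicri" '
    'xmlns:inist="urn:example:inist">{}</wrapper>'
)


def summarize(record_xml: str) -> dict:
    """Return title, author names, year and first RBID of one record."""
    root = ET.fromstring(WRAPPER.format(record_xml))
    title_stmt = root.find("record/TEI/teiHeader/fileDesc/titleStmt")
    pub_stmt = root.find("record/TEI/teiHeader/fileDesc/publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "year": pub_stmt.find("date").get("when"),
        # A full record may carry several RBID entries; this keeps the first.
        "rbid": pub_stmt.findtext("idno[@type='RBID']"),
    }


summary = summarize(RECORD)
print(summary)
```

The same function could be applied to the output of the `HfdSelect` commands above after saving it to a file, provided the undeclared prefixes and any unescaped characters are handled as described.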

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Francis:09-0009259
   |texte=   On noise masking for automatic missing data speech recognition : A survey and discussion
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022